Optimizing the Chase: Scalable Data Integration under Constraints
Abstract
We are interested in scalable data integration and data exchange under constraints (dependencies). In data exchange, the problem is how to materialize a target database instance that satisfies the source-to-target and target dependencies and provides the certain answers. In data integration, the problem is how to rewrite a query over the target schema into a query over the source schemas that provides the certain answers. In both problems we make use of the chase algorithm, the main tool for reasoning with dependencies. Our first contribution is to introduce the frugal chase, which produces smaller universal solutions than the standard chase while remaining polynomial in data complexity. Our second contribution is to use the frugal chase to scale up query answering using views under LAV weakly acyclic target constraints, a useful language capturing RDF/S. The latter problem can be reduced to query rewriting using views without constraints by chasing the source-to-target mappings with the target constraints. We construct a compact graph-based representation of the mappings and the constraints and develop an efficient algorithm to run the frugal chase on this representation. We show experimentally that our approach scales to large problems, speeding up the compilation of the dependencies into the mappings by close to 2 and 3 orders of magnitude compared to the standard chase and the core chase, respectively. Compared to the standard chase, we improve online query rewriting time by a factor of 3, while producing equivalent but smaller rewritings of the original query.
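For readers unfamiliar with the chase, the following is a minimal Python sketch of the standard (restricted) chase over tuple-generating dependencies. It is not the frugal chase introduced in the paper, and it does not use the paper's graph-based representation. The function names (`chase`, `homomorphisms`, `match`), the `?x` variable convention, the `_N1` labeled-null naming, and the example predicates (`SrcEmp`, `Emp`, `WorksIn`, `Dept`) are all assumptions made for illustration.

```python
from itertools import count

# Illustrative conventions (assumptions, not taken from the paper):
# an atom or fact is a tuple (predicate, arg1, ...); variables are strings
# starting with "?"; labeled nulls are strings of the form "_N<k>".

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, subst):
    """Extend subst so that atom maps onto fact; return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    s = dict(subst)
    for a, f in zip(atom[1:], fact[1:]):
        if is_var(a):
            if s.setdefault(a, f) != f:   # variable already bound to another value
                return None
        elif a != f:                      # constant mismatch
            return None
    return s

def homomorphisms(atoms, facts, subst=None):
    """Yield every substitution that maps all atoms into the set of facts."""
    subst = {} if subst is None else subst
    if not atoms:
        yield subst
        return
    for fact in facts:
        s = match(atoms[0], fact, subst)
        if s is not None:
            yield from homomorphisms(atoms[1:], facts, s)

def chase(facts, tgds, max_rounds=100):
    """Standard (restricted) chase: a dependency body -> head fires for a
    body match only when the head is not already satisfied for that match."""
    facts = set(facts)
    fresh = count(1)
    for _ in range(max_rounds):
        changed = False
        for body, head in tgds:
            for s in list(homomorphisms(body, facts)):
                if next(homomorphisms(head, facts, s), None) is not None:
                    continue              # head already satisfied, nothing to add
                ext = dict(s)
                for atom in head:
                    for t in atom[1:]:
                        if is_var(t) and t not in ext:
                            ext[t] = f"_N{next(fresh)}"   # fresh labeled null
                facts |= {(a[0], *[ext.get(t, t) for t in a[1:]]) for a in head}
                changed = True
        if not changed:
            return facts
    raise RuntimeError("chase did not reach a fixpoint within max_rounds")

# Hypothetical example: one source-to-target mapping and two target
# constraints, each with a single body atom.
tgds = [
    ([("SrcEmp", "?x")], [("Emp", "?x")]),
    ([("Emp", "?x")], [("WorksIn", "?x", "?d")]),      # ?d is existential
    ([("WorksIn", "?x", "?d")], [("Dept", "?d")]),
]
result = chase({("SrcEmp", "alice")}, tgds)
```

On this input the sketch adds `Emp(alice)`, `WorksIn(alice, _N1)` and `Dept(_N1)` to the source fact. If `WorksIn(alice, sales)` were already present, the second dependency would not fire and no null would be created. The `max_rounds` bound is only a safeguard for this sketch, since the chase is not guaranteed to terminate for arbitrary dependencies.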
Journal:
- PVLDB
Volume 7, Issue
Pages -
Publication date: 2014